Duplicate code detection using anti-unification

Authors

  • Peter Bulychev
  • Marius Minea
Abstract

This paper describes a new algorithm for finding software clones. It works at the level of abstract syntax trees and is therefore conceptually independent of the source language of the analyzed programs. The algorithm treats two sequences of statements as a clone if one of them can be obtained from the other by replacing some subtrees. To our knowledge, this notion has not previously been used in the literature. It allows all information about the syntactic structure of a program to be taken into account. We have implemented the algorithm in the tool Clone Digger, which currently supports the Python and Java languages. Clone Digger is free software provided under the GPL license.
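The clone notion in the abstract (one statement sequence obtained from another by replacing subtrees) can be illustrated with a minimal sketch on Python's standard `ast` module. This is not Clone Digger's implementation: the `Hole` class and the `generalize` and `find_holes` functions are hypothetical names introduced here, and the sketch only marks where two trees differ, leaving open how many or how large the differences may be before two fragments stop counting as a clone.

```python
import ast


class Hole:
    """Hypothetical placeholder for a position where the two trees differ."""

    def __init__(self, left, right):
        self.left = left    # subtree found in the first program
        self.right = right  # subtree found in the second program


def generalize(a, b):
    """Return the shared structure of a and b, with a Hole at each mismatch."""
    # Two AST nodes of the same type: generalize them field by field.
    if isinstance(a, ast.AST) and isinstance(b, ast.AST) and type(a) is type(b):
        fields = [generalize(getattr(a, f, None), getattr(b, f, None))
                  for f in a._fields]
        return (type(a).__name__, fields)
    # Two lists of equal length (statement bodies, argument lists, ...).
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        return [generalize(x, y) for x, y in zip(a, b)]
    # Identical leaves (identifiers, constants, None).
    if not isinstance(a, (ast.AST, list)) and type(a) is type(b) and a == b:
        return a
    # Anything else is a replaced subtree.
    return Hole(a, b)


def find_holes(tree):
    """Collect every Hole in a generalized tree."""
    if isinstance(tree, Hole):
        return [tree]
    if isinstance(tree, tuple):
        return find_holes(tree[1])
    if isinstance(tree, list):
        return [h for item in tree for h in find_holes(item)]
    return []


first = ast.parse("total = price * count\nprint(total)").body
second = ast.parse("total = price * (count + 1)\nprint(total)").body
found = find_holes(generalize(first, second))
```

Here the two statement sequences differ in a single subtree: the name `count` in the first is replaced by the expression `count + 1` in the second, so `found` holds one `Hole`.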


Similar resources

An evaluation of duplicate code detection using anti-unification

This paper describes an algorithm for finding software clones, which works at the level of abstract syntax trees and is thus conceptually independent of the source language of the analyzed programs. We use a notion of clones which captures replacement of subtrees in the program AST, and is formally based on the notion of anti-unification. This allows us to capture syntactic structural similarit...


Term-Graph Anti-Unification

We study anti-unification for possibly cyclic, unranked term-graphs and develop an algorithm, which computes a minimal complete set of least general generalizations for them. For bisimilar graphs the algorithm computes the join in the lattice generated by a functional bisimulation. Besides, we consider the case when the graph edges are not ordered (modeled by commutativity). These results gener...


Sixth International Symposium on Symbolic Computation in Software Science

Generalization problems arise in many areas of software science: code clone detection, program reuse, partial evaluation, program synthesis, invariant generation, etc. Anti-unification is a technique often used to solve generalization problems. In this paper we describe an open-source library of some newly developed anti-unification algorithms in various theories: for first- and second-order unra...


Improving the Unification of Software Clones Using Tree and Graph Matching Algorithms

Giri Panamoottil Krishnan. Code duplication is common in all kinds of software systems and is one of the most troublesome hurdles in software maintenance and evolution activities. Even though these code clones are created for the reuse of some functionality, they usually go through several modifications after th...


Anti-unification Algorithms and Their Applications in Program Analysis

A term t is called a template of terms t1 and t2 iff t1 = tη1 and t2 = tη2 for some substitutions η1 and η2. A template t of t1 and t2 is called most specific iff for any template t′ of t1 and t2 there exists a substitution ξ such that t = t′ξ. The anti-unification problem is that of computing the most specific template of two given terms. This problem is dual to the well-known unificatio...
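The definition above can be made concrete with a small sketch of the textbook first-order procedure, which is not necessarily the algorithm the listed paper develops. The term encoding (nested tuples whose first element is the function symbol, strings for constants, and strings beginning with `?` for template variables) and the names `anti_unify` and `substitute` are assumptions made for illustration.

```python
def anti_unify(t1, t2, table):
    """Compute a most specific template of t1 and t2.

    `table` maps each pair of differing subterms to one variable, so the
    same disagreement is always generalized by the same variable.
    """
    if t1 == t2:
        return t1
    # Same function symbol and arity: descend into the arguments.
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)):
        args = tuple(anti_unify(a, b, table) for a, b in zip(t1[1:], t2[1:]))
        return (t1[0],) + args
    # Otherwise replace the pair by a variable, reusing it if already seen.
    if (t1, t2) not in table:
        table[(t1, t2)] = f"?{len(table)}"
    return table[(t1, t2)]


def substitute(term, subst):
    """Apply a substitution (variable -> term) to a term."""
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(a, subst) for a in term[1:])
    return subst.get(term, term)


t1 = ("f", "a", ("g", "a"))
t2 = ("f", "b", ("g", "b"))
table = {}
t = anti_unify(t1, t2, table)

# Read the substitutions eta1 and eta2 off the table of differing pairs.
eta1 = {var: left for (left, _), var in table.items()}
eta2 = {var: right for (_, right), var in table.items()}
```

For these inputs the template is f(?0, g(?0)): both disagreements are the pair (a, b), so they share one variable, and applying `eta1` or `eta2` to `t` recovers `t1` or `t2` respectively. Using a fresh variable at each position would give f(?0, g(?1)), which is also a template but not the most specific one.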




Publication date: 2008